Revaluation of a Large-Scale Thesaurus for Multi-media Indexing: An Experience Report

Authors

  • Dirk Deridder
  • Peter Soetens
Abstract

In this paper we provide a preliminary overview of a number of problems we encountered when faced with the revaluation of a large-scale monolingual thesaurus. The thesaurus in question is used to wade through the vast multimedia archive of the Flemish public radio and television broadcaster (VRT). In order to support advanced and ‘knowledgeable’ queries on the archive, it became imperative to upgrade the existing infrastructure. In this context we performed an in-depth analysis of the existing legacy situation. This led to the identification of a number of structural problems as well as problems with respect to content. Solutions to counter some of these have already been established. To support the new search requirements for the archive, we have migrated the existing system to an ontology-inspired infrastructure.

Similar resources

Feasibility Study of a Plan to Develop a Thesaurus of Women and Family Studies Based on the BS ISO 25964-1 Standard

Research Objective: Feasibility study of the Family and Women’s Studies Thesaurus. Considering the expansion of information in the field of women and family studies, as well as the wide span of related vocabulary and the development of vocabulary lists and bibliographies, the Family and Women’s Studies Thesaurus can be a professional tool for indexing and retrieval of women’s information in data...

English-Japanese Cross-lingual Query Expansion Using Random Indexing of Aligned Bilingual Text Data

Vector space models can be used for extracting semantically similar words from the co-occurrence statistics of words in large text data. In this paper, we report on our NTCIR 2002 experiments using the Random Indexing vector space method for extracting an English-Japanese cross-lingual thesaurus from aligned English-Japanese bilingual data. The cross-lingual thesaurus has been used for automatic...

Alleviating Search Uncertainty Through Concept Associations: Automatic Indexing, Co-Occurrence Analysis, and Parallel Computing

In this article, we report research on an algorithmic approach to alleviating search uncertainty in a large information space. Grounded on object filtering, automatic indexing, and co-occurrence analysis, we performed a ... gather, process, and retrieve information. These systems provide a wide variety of information and services, ranging from daily updates of foreign and national news, movie revie...

Large-Scale Linguistic Ontology as a Basis for Text Categorization of Legislative Documents

The paper describes the structure and properties of a large linguistic ontology – a new kind of information retrieval thesaurus, the Thesaurus on Sociopolitical Life for Conceptual Indexing. The thesaurus is used in various real-scale information-retrieval applications in the legal domain. At present, one of the main applications of the Thesaurus is knowledge-based text categorization. Categories are ...

A Comparative Study of the Thesaurus of Islamic Teachings and the Thesaurus of Quranic Sciences

This study compares the strengths and weaknesses of the Thesaurus of Islamic Teachings and the Thesaurus of Quranic Sciences. In today's society, where documents are kept electronically, the retrieval and dissemination of information for the development of research is of much greater importance than saving documents, and the thesaurus is the basis for indexing in various sciences. One of the solutions fo...

Publication date: 2003